Path-Synchronous Performance Monitoring in HPC Interconnection Networks with Source-Code Attribution
نویسندگان
چکیده
Performance anomalies involving interconnection networks have largely remained a “black box” for developers relying on traditional CPU profilers. Network-side profilers collect aggregate statistics and lack source-code attribution. We have incorporated an effective protocol extension in the Gen-Z communication protocol for tagging network packets in an interconnection network; additionally, we have backed the protocol extension with hardware and software enhancements that allow tracking the flow of a network transaction through every hop in the interconnection network and associate it back to the application source code. The result is a first-of-its-kind hardware-assisted telemetry of disparate, autonomous interconnection networking components with application source code association that offers better developer insights. Our scheme works on a sampling basis to ensure low runtime overhead and generates modest volumes of data. Simulation of our methods in the open-source Structural Simulation Toolkit (SST/Macro) shows its effectiveness— deep insights into the underlying network details to the developer at minimal overheads.
منابع مشابه
IMORC: An infrastructure for performance monitoring and optimization of reconfigurable computers
For many years academic research has studied the use of application-specific coprocessors based on field-programmable gate arrays (FPGAs) to accelerate high-performance computing (HPC) applications. Since major supercomputer vendors now provide servers with integrated reconfigurable accelerators, this technology is available to a much broader group of users. Still, designing an accelerator and ...
متن کاملAttribution Bias in schizophrenian patients who have auditory hallucination
Introduction: Concerning cognitivism, psychotic experiences (hallucination) of schizophrenic patiets have been hypothesized to originate from a fundamentally cognitive biases. Methods: To explor the idea that attribution bias may underlin appearance of auditory hallucination, in the current descriptive study, a source-monitoring task were used to compare healthy controles with relatives of indi...
متن کاملPerformance Analysis of a New Neural Network for Routing in Mesh Interconnection Networks
Routing is one of the basic parts of a message passing multiprocessor system. The routing procedure has a great impact on the efficiency of a system. Neural algorithms that are currently in use for computer networks require a large number of neurons. If a specific topology of a multiprocessor network is considered, the number of neurons can be reduced. In this paper a new recurrent neural ne...
متن کاملPerformance Analysis of a New Neural Network for Routing in Mesh Interconnection Networks
Routing is one of the basic parts of a message passing multiprocessor system. The routing procedure has a great impact on the efficiency of a system. Neural algorithms that are currently in use for computer networks require a large number of neurons. If a specific topology of a multiprocessor network is considered, the number of neurons can be reduced. In this paper a new recurrent neural ne...
متن کاملEvaluation of HPC architectures for BRAMS numerical weather model
This paper investigates the performance of a weather forecasting application (Brazilian Regional Atmospheric Modeling System BRAMS) on a number of selected HPC clusters in order to understand the impact of different architectural configurations on its performance and scalability. We simulated atmosphere conditions over South America for 24 hours ahead with BRAMS, using 100 cores as a starting p...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017